{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Mix Self-Consistency Notebook \n",
    "\n",
    "<a href=\"https://colab.research.google.com/github/run-llama/llama-hub/blob/main/llama_hub/llama_packs/tables/mix_self_consistency/mix_self_consistency.ipynb\" target=\"_parent\">\n",
    "<img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
    "\n",
    "In this notebook, we highlight the mix self-consistency method proposed in [\"Rethinking Tabular Data Understanding with Large Language Models\"](https://arxiv.org/pdf/2312.16702v1.pdf) paper by Liu et al.\n",
    "\n",
    "LLMs can reason over tabular data in 2 main ways:\n",
    "1. textual reasoning via direct prompting\n",
    "2. symbolic reasoning via program synthesis (e.g. python, SQL, etc)\n",
    "\n",
    "The key insight of the paper is that different reasoning pathways work well in different tasks. By aggregating results from both with a self-consistency mechanism (i.e. majority voting), it achieves SoTA performance.\n",
    "\n",
    "We implemented the paper based on the prompts described in the paper, and adapted it to get it working. That said, this is marked as beta, so there may still be kinks to work through. Do you have suggestions / contributions on how to improve the robustness? Let us know! "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Download Data\n",
    "\n",
    "We use the [WikiTableQuestions dataset](https://ppasupat.github.io/WikiTableQuestions/) (Pasupat and Liang 2015) as our test dataset.\n",
    "\n",
    "WikiTableQuestions is a question-answering dataset over various semi-structured tables taken from Wikipedia. These tables range in size from a few rows/columns to mnay rows. Some columns may contain multi-part information as well (e.g. a temperature column may contain both Fahrenheight and Celsius)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "!wget \"https://github.com/ppasupat/WikiTableQuestions/releases/download/v1.0.2/WikiTableQuestions-1.0.2-compact.zip\" -O data.zip\n",
    "!unzip data.zip"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's visualize some examples."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>utterance</th>\n",
       "      <th>context</th>\n",
       "      <th>targetValue</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>nt-0</td>\n",
       "      <td>what was the last year where this team was a p...</td>\n",
       "      <td>csv/204-csv/590.csv</td>\n",
       "      <td>2004</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>nt-1</td>\n",
       "      <td>in what city did piotr's last 1st place finish...</td>\n",
       "      <td>csv/204-csv/622.csv</td>\n",
       "      <td>Bangkok, Thailand</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>nt-2</td>\n",
       "      <td>which team won previous to crettyard?</td>\n",
       "      <td>csv/204-csv/772.csv</td>\n",
       "      <td>Wolfe Tones</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>nt-3</td>\n",
       "      <td>how many more passengers flew to los angeles t...</td>\n",
       "      <td>csv/203-csv/515.csv</td>\n",
       "      <td>12,467</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>nt-4</td>\n",
       "      <td>who was the opponent in the first game of the ...</td>\n",
       "      <td>csv/204-csv/495.csv</td>\n",
       "      <td>Derby County</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     id                                          utterance  \\\n",
       "0  nt-0  what was the last year where this team was a p...   \n",
       "1  nt-1  in what city did piotr's last 1st place finish...   \n",
       "2  nt-2              which team won previous to crettyard?   \n",
       "3  nt-3  how many more passengers flew to los angeles t...   \n",
       "4  nt-4  who was the opponent in the first game of the ...   \n",
       "\n",
       "               context        targetValue  \n",
       "0  csv/204-csv/590.csv               2004  \n",
       "1  csv/204-csv/622.csv  Bangkok, Thailand  \n",
       "2  csv/204-csv/772.csv        Wolfe Tones  \n",
       "3  csv/203-csv/515.csv             12,467  \n",
       "4  csv/204-csv/495.csv       Derby County  "
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "\n",
    "examples = pd.read_table(\"WikiTableQuestions/data/training-before300.tsv\")\n",
    "\n",
    "examples.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The question is:  what was the last year where this team was a part of the usl a-league?\n"
     ]
    }
   ],
   "source": [
    "example = examples.iloc[0]\n",
    "question = example[\"utterance\"]\n",
    "context = example[\"context\"]\n",
    "print(\"The question is: \", question)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's load the table that can be used as context to answer the question in the first example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Year</th>\n",
       "      <th>Division</th>\n",
       "      <th>League</th>\n",
       "      <th>Regular Season</th>\n",
       "      <th>Playoffs</th>\n",
       "      <th>Open Cup</th>\n",
       "      <th>Avg. Attendance</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2001</td>\n",
       "      <td>2</td>\n",
       "      <td>USL A-League</td>\n",
       "      <td>4th, Western</td>\n",
       "      <td>Quarterfinals</td>\n",
       "      <td>Did not qualify</td>\n",
       "      <td>7,169</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2002</td>\n",
       "      <td>2</td>\n",
       "      <td>USL A-League</td>\n",
       "      <td>2nd, Pacific</td>\n",
       "      <td>1st Round</td>\n",
       "      <td>Did not qualify</td>\n",
       "      <td>6,260</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2003</td>\n",
       "      <td>2</td>\n",
       "      <td>USL A-League</td>\n",
       "      <td>3rd, Pacific</td>\n",
       "      <td>Did not qualify</td>\n",
       "      <td>Did not qualify</td>\n",
       "      <td>5,871</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2004</td>\n",
       "      <td>2</td>\n",
       "      <td>USL A-League</td>\n",
       "      <td>1st, Western</td>\n",
       "      <td>Quarterfinals</td>\n",
       "      <td>4th Round</td>\n",
       "      <td>5,628</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2005</td>\n",
       "      <td>2</td>\n",
       "      <td>USL First Division</td>\n",
       "      <td>5th</td>\n",
       "      <td>Quarterfinals</td>\n",
       "      <td>4th Round</td>\n",
       "      <td>6,028</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   Year  Division              League Regular Season         Playoffs  \\\n",
       "0  2001         2        USL A-League   4th, Western    Quarterfinals   \n",
       "1  2002         2        USL A-League   2nd, Pacific        1st Round   \n",
       "2  2003         2        USL A-League   3rd, Pacific  Did not qualify   \n",
       "3  2004         2        USL A-League   1st, Western    Quarterfinals   \n",
       "4  2005         2  USL First Division            5th    Quarterfinals   \n",
       "\n",
       "          Open Cup Avg. Attendance  \n",
       "0  Did not qualify           7,169  \n",
       "1  Did not qualify           6,260  \n",
       "2  Did not qualify           5,871  \n",
       "3        4th Round           5,628  \n",
       "4        4th Round           6,028  "
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "table = pd.read_csv(\"WikiTableQuestions/\" + context)\n",
    "table.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The corect answer should be 2004."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load Pack / Setup\n",
    "\n",
    "Now we do `download_llama_pack` to load the Mix Self Consistency LlamaPack (you can also import the module directly if using the llama-hub package).\n",
    "\n",
    "We will also optionally setup observability/tracing so we can observe the intermediate steps."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Option: if developing with the llama_hub package\n",
    "# from llama_hub.llama_packs.tables.mix_self_consistency.base import (\n",
    "#     MixSelfConsistencyQueryEngine,\n",
    "# )\n",
    "\n",
    "# Option: download llama_pack\n",
    "from llama_index.llama_pack import download_llama_pack\n",
    "\n",
    "download_llama_pack(\n",
    "    \"MixSelfConsistencyPack\",\n",
    "    \"./mix_self_consistency_pack\",\n",
    "    skip_load=True,\n",
    "    # leave the below line commented out if using the notebook on main\n",
    "    # llama_hub_url=\"https://raw.githubusercontent.com/run-llama/llama-hub/suo/table_qa/llama_hub\"\n",
    ")\n",
    "from mix_self_consistency_pack.base import MixSelfConsistencyQueryEngine"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.llms import OpenAI\n",
    "\n",
    "llm = OpenAI()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Optional: Setup Observability\n",
    "\n",
    "Here we will use our Arize Phoenix integration to view traces through the query engine."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import phoenix as px\n",
    "import llama_index\n",
    "\n",
    "px.launch_app()\n",
    "llama_index.set_global_handler(\"arize_phoenix\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Run experiments"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's try out different modes."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Textual Reasoning Only (i.e. direct prompting)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's start with using only the textual reasoning path.\n",
    "Basically, we directly convert the pandas dataframe into a markdown representation, and inject into the prompt for in-context reasoning."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "query_engine = MixSelfConsistencyQueryEngine(\n",
    "    df=table,\n",
    "    llm=llm,\n",
    "    text_paths=1,\n",
    "    symbolic_paths=0,\n",
    "    aggregation_mode=\"none\",\n",
    "    verbose=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[1;3;38;2;155;135;227m> Running module c6ffdfda-15f3-4a4c-9d7d-3a7ab025437c with input: \n",
      "question: what was the last year where this team was a part of the usl a-league?\n",
      "table: |    |   Year |   Division | League              | Regular Season   | Playoffs        | Open Cup        | Avg. Attendance   |\n",
      "|---:|-------:|-----------:|:--------------------|:-----------------|:----...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module fa48cf7a-e2c2-47d7-994c-2e631e9c84b0 with input: \n",
      "messages: You are an advanced AI capable of analyzing and understanding information within tables. Read the table below.\n",
      "\n",
      "|    |   Year |   Division | League              | Regular Season   | Playoffs        | ...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module df320e1c-8268-4f9f-813d-4ce0d9322de6 with input: \n",
      "input: assistant: Step 1: Identify the rows where the team was a part of the USL A-League.\n",
      "- Row 0: Year 2001, Division 2, League USL A-League\n",
      "- Row 1: Year 2002, Division 2, League USL A-League\n",
      "- Row 2: Yea...\n",
      "\n",
      "\u001b[0mText results: ['2003']\n",
      "Symbolic results: []\n",
      "2003\n"
     ]
    }
   ],
   "source": [
    "response = await query_engine.aquery(example[\"utterance\"])\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We get an incorrect answer. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "query_engine = MixSelfConsistencyQueryEngine(\n",
    "    df=table,\n",
    "    llm=llm,\n",
    "    text_paths=5,\n",
    "    symbolic_paths=0,\n",
    "    aggregation_mode=\"self-consistency\",\n",
    "    verbose=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[1;3;38;2;155;135;227m> Running module 29953b3a-c85d-45f3-ad2e-1883a8b1ff64 with input: \n",
      "question: what was the last year where this team was a part of the usl a-league?\n",
      "table: |    |   Year |   Division | League              | Regular Season   | Playoffs        | Open Cup        | Avg. Attendance   |\n",
      "|---:|-------:|-----------:|:--------------------|:-----------------|:----...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module 0d48eb39-6d33-4174-a166-858e5f773870 with input: \n",
      "messages: You are an advanced AI capable of analyzing and understanding information within tables. Read the table below.\n",
      "\n",
      "|    |   Year |   Division | League              | Regular Season   | Playoffs        | ...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module 80a6c1b8-6e7a-4827-96f3-d6be664b4d85 with input: \n",
      "question: what was the last year where this team was a part of the usl a-league?\n",
      "table: |    |   Year |   Division | League              | Regular Season   | Playoffs        | Open Cup        | Avg. Attendance   |\n",
      "|---:|-------:|-----------:|:--------------------|:-----------------|:----...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module aa50e0fc-801c-4429-9ff9-7a6b83988cd9 with input: \n",
      "messages: You are an advanced AI capable of analyzing and understanding information within tables. Read the table below.\n",
      "\n",
      "|    |   Year |   Division | League              | Regular Season   | Playoffs        | ...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module a3b97076-31f6-4b19-9f97-dc1c04b5185d with input: \n",
      "question: what was the last year where this team was a part of the usl a-league?\n",
      "table: |    |   Year |   Division | League              | Regular Season   | Playoffs        | Open Cup        | Avg. Attendance   |\n",
      "|---:|-------:|-----------:|:--------------------|:-----------------|:----...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module ad2bc047-50ec-401c-a098-97eafe45c688 with input: \n",
      "messages: You are an advanced AI capable of analyzing and understanding information within tables. Read the table below.\n",
      "\n",
      "|    |   Year |   Division | League              | Regular Season   | Playoffs        | ...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module b506d936-55fb-4540-a37d-ca8e7cf587ce with input: \n",
      "question: what was the last year where this team was a part of the usl a-league?\n",
      "table: |    |   Year |   Division | League              | Regular Season   | Playoffs        | Open Cup        | Avg. Attendance   |\n",
      "|---:|-------:|-----------:|:--------------------|:-----------------|:----...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module db1ffb31-f2ea-445b-bc0b-1a44a10849f5 with input: \n",
      "messages: You are an advanced AI capable of analyzing and understanding information within tables. Read the table below.\n",
      "\n",
      "|    |   Year |   Division | League              | Regular Season   | Playoffs        | ...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module f556cbdc-38c6-4a34-8c75-fc17226f41ef with input: \n",
      "question: what was the last year where this team was a part of the usl a-league?\n",
      "table: |    |   Year |   Division | League              | Regular Season   | Playoffs        | Open Cup        | Avg. Attendance   |\n",
      "|---:|-------:|-----------:|:--------------------|:-----------------|:----...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module 983d97e1-f0ff-4fac-bb3e-e893c8fed21d with input: \n",
      "messages: You are an advanced AI capable of analyzing and understanding information within tables. Read the table below.\n",
      "\n",
      "|    |   Year |   Division | League              | Regular Season   | Playoffs        | ...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module 7c0cddbf-dc03-4b29-80f6-6fecfe00457d with input: \n",
      "input: assistant: Step 1: Analyze the table and identify the rows where the league is mentioned as \"USL A-League.\"\n",
      "\n",
      "From the table, we can see that the rows with indices 0, 1, and 2 mention the league as \"US...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module a77b76ac-bf79-4bcb-bef7-9a910cd3e7ce with input: \n",
      "input: assistant: To determine the last year that this team was a part of the USL A-League, we need to go through each row of the table and check the value in the \"League\" column.\n",
      "\n",
      "The team was a part of the...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module 47d62d6b-d8b3-4d77-8df1-9116b73bb2ad with input: \n",
      "input: assistant: To find the last year this team was a part of the USL A-League, we need to analyze the \"Division\" column in the table. \n",
      "\n",
      "Looking at the \"Division\" column, we can see that the team was a par...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module 3985073b-2f55-4a59-9d02-268f7885f7a1 with input: \n",
      "input: assistant: Step 1: Identify the rows where the team was a part of the USL A-League.\n",
      "Row 0: Year 2001, Division 2, League USL A-League\n",
      "Row 1: Year 2002, Division 2, League USL A-League\n",
      "Row 2: Year 2003...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module 57cde50d-6e80-488d-b6f6-459977efec91 with input: \n",
      "input: assistant: To find the last year in which this team was a part of the USL A-League, we need to look for the rows where the \"Division\" column has a value of 2 and the \"League\" column has a value of \"US...\n",
      "\n",
      "\u001b[0mAggregation mode: self-consistency\n",
      "Text results: ['2004', '2003', '2003', '2003', '2003']\n",
      "Symbolic results: []\n",
      "2003\n"
     ]
    }
   ],
   "source": [
    "response = await query_engine.aquery(example[\"utterance\"])\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We still get incorrect result after sampling 5 textua reasoning paths, and aggregating the results via self-consistency."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Symbolid Reasoning Only (i.e. python shell)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, let's use symbolic reasoning. Here, LLM generates python expressions directly manipuate the pandas dataframe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "query_engine = MixSelfConsistencyQueryEngine(\n",
    "    df=table,\n",
    "    llm=llm,\n",
    "    text_paths=0,\n",
    "    symbolic_paths=1,\n",
    "    aggregation_mode=\"none\",\n",
    "    verbose=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "> Pandas Instructions:\n",
      "```\n",
      "df[df['League'] == 'USL A-League']['Year'].max()\n",
      "```\n",
      "> Pandas Output: 2004\n",
      "Aggregation mode: none\n",
      "Text results: []\n",
      "Symbolic results: ['2004']\n",
      "2004\n"
     ]
    }
   ],
   "source": [
    "response = await query_engine.aquery(example[\"utterance\"])\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We get the correct answer here. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Aggregation via Self-Evaluation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we consider self-evaluation for aggregating results across textual and symbolic reasoning paths.\n",
    "Basically, we tell the LLM what each reasoning path is good at to obtain a final result."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "query_engine = MixSelfConsistencyQueryEngine(\n",
    "    df=table,\n",
    "    llm=llm,\n",
    "    text_paths=1,\n",
    "    symbolic_paths=1,\n",
    "    aggregation_mode=\"self-evaluation\",\n",
    "    verbose=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[1;3;38;2;155;135;227m> Running module 213e182e-6352-4662-8a51-78099ad8fd7b with input: \n",
      "question: what was the last year where this team was a part of the usl a-league?\n",
      "table: |    |   Year |   Division | League              | Regular Season   | Playoffs        | Open Cup        | Avg. Attendance   |\n",
      "|---:|-------:|-----------:|:--------------------|:-----------------|:----...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module 1b7e4a6f-e459-4bac-8369-a07050a8a899 with input: \n",
      "messages: You are an advanced AI capable of analyzing and understanding information within tables. Read the table below.\n",
      "\n",
      "|    |   Year |   Division | League              | Regular Season   | Playoffs        | ...\n",
      "\n",
      "\u001b[0m> Pandas Instructions:\n",
      "```\n",
      "df[df['League'] == 'USL A-League']['Year'].max()\n",
      "```\n",
      "> Pandas Output: 2004\n",
      "\u001b[1;3;38;2;155;135;227m> Running module 1598e49c-0165-466f-b927-af6bcc712317 with input: \n",
      "input: assistant: Step 1: Analyze the table.\n",
      "\n",
      "The table provides information about a team's performance in different years, including the year, division, league, regular season performance, playoffs performa...\n",
      "\n",
      "\u001b[0mAggregation mode: self-evaluation\n",
      "Text results: ['2003']\n",
      "Symbolic results: ['2004']\n",
      "\u001b[1;3;38;2;155;135;227m> Running module f923280a-f299-4697-9287-f680fa731f69 with input: \n",
      "question: what was the last year where this team was a part of the usl a-league?\n",
      "table: |    |   Year |   Division | League              | Regular Season   | Playoffs        | Open Cup        | Avg. Attendance   |\n",
      "|---:|-------:|-----------:|:--------------------|:-----------------|:----...\n",
      "textual_answer: 2003\n",
      "symbolic_answer: 2004\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module 4266d9da-ae4f-45f6-8a47-ee0a1d983c85 with input: \n",
      "messages: Below is a markdown table:\n",
      "\n",
      "|    |   Year |   Division | League              | Regular Season   | Playoffs        | Open Cup        | Avg. Attendance   |\n",
      "|---:|-------:|-----------:|:-----------------...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module 8d12554c-2737-4772-a79b-0aa4fdb153e8 with input: \n",
      "input: assistant: Based on the evaluation process, let's proceed with the analysis:\n",
      "\n",
      "1. Preliminary Evaluation:\n",
      "   - Answer A: 2003. This directly addresses the question by providing the last year where the ...\n",
      "\n",
      "\u001b[0m2004\n"
     ]
    }
   ],
   "source": [
    "response = await query_engine.aquery(example[\"utterance\"])\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We obtain the correct result now."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Aggregation via Mix Self-Consistency"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we consider the SoTA method, which aggregates results across reasoning paths via self-consistency (i.e. majority voting) "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "query_engine = MixSelfConsistencyQueryEngine(\n",
    "    df=table,\n",
    "    llm=llm,\n",
    "    text_paths=5,\n",
    "    symbolic_paths=5,\n",
    "    aggregation_mode=\"self-consistency\",\n",
    "    verbose=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[1;3;38;2;155;135;227m> Running module 101db377-8822-49aa-9d50-2c7020e08c39 with input: \n",
      "question: what was the last year where this team was a part of the usl a-league?\n",
      "table: |    |   Year |   Division | League              | Regular Season   | Playoffs        | Open Cup        | Avg. Attendance   |\n",
      "|---:|-------:|-----------:|:--------------------|:-----------------|:----...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module dfd8b57f-1839-4d44-964f-1eefee7356be with input: \n",
      "messages: You are an advanced AI capable of analyzing and understanding information within tables. Read the table below.\n",
      "\n",
      "|    |   Year |   Division | League              | Regular Season   | Playoffs        | ...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module d69d56f1-8df3-4c95-8a76-8dea842c855b with input: \n",
      "question: what was the last year where this team was a part of the usl a-league?\n",
      "table: |    |   Year |   Division | League              | Regular Season   | Playoffs        | Open Cup        | Avg. Attendance   |\n",
      "|---:|-------:|-----------:|:--------------------|:-----------------|:----...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module 15e0e2c5-9a35-406d-9b11-819799523c66 with input: \n",
      "messages: You are an advanced AI capable of analyzing and understanding information within tables. Read the table below.\n",
      "\n",
      "|    |   Year |   Division | League              | Regular Season   | Playoffs        | ...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module a05ba6eb-1522-418c-874f-beeee209f569 with input: \n",
      "question: what was the last year where this team was a part of the usl a-league?\n",
      "table: |    |   Year |   Division | League              | Regular Season   | Playoffs        | Open Cup        | Avg. Attendance   |\n",
      "|---:|-------:|-----------:|:--------------------|:-----------------|:----...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module eb9c101a-7eb3-48a6-a569-cd3465696e2c with input: \n",
      "messages: You are an advanced AI capable of analyzing and understanding information within tables. Read the table below.\n",
      "\n",
      "|    |   Year |   Division | League              | Regular Season   | Playoffs        | ...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module 23c947ca-50b0-45ce-995e-09ddc5d9404a with input: \n",
      "question: what was the last year where this team was a part of the usl a-league?\n",
      "table: |    |   Year |   Division | League              | Regular Season   | Playoffs        | Open Cup        | Avg. Attendance   |\n",
      "|---:|-------:|-----------:|:--------------------|:-----------------|:----...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module cfe16e45-1221-4dfa-a6f9-03426e9681fb with input: \n",
      "messages: You are an advanced AI capable of analyzing and understanding information within tables. Read the table below.\n",
      "\n",
      "|    |   Year |   Division | League              | Regular Season   | Playoffs        | ...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module acac110b-40a7-4179-868f-0f32a32b7583 with input: \n",
      "question: what was the last year where this team was a part of the usl a-league?\n",
      "table: |    |   Year |   Division | League              | Regular Season   | Playoffs        | Open Cup        | Avg. Attendance   |\n",
      "|---:|-------:|-----------:|:--------------------|:-----------------|:----...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module da86d1e7-1fa4-444b-8164-69ae22283092 with input: \n",
      "messages: You are an advanced AI capable of analyzing and understanding information within tables. Read the table below.\n",
      "\n",
      "|    |   Year |   Division | League              | Regular Season   | Playoffs        | ...\n",
      "\n",
      "\u001b[0m> Pandas Instructions:\n",
      "```\n",
      "df[df['League'] == 'USL A-League']['Year'].max()\n",
      "```\n",
      "> Pandas Output: 2004\n",
      "> Pandas Instructions:\n",
      "```\n",
      "df[df['League'] == 'USL A-League']['Year'].max()\n",
      "```\n",
      "> Pandas Output: 2004\n",
      "> Pandas Instructions:\n",
      "```\n",
      "df[df['League'] == 'USL A-League']['Year'].max()\n",
      "```\n",
      "> Pandas Output: 2004\n",
      "> Pandas Instructions:\n",
      "```\n",
      "df[df['League'] == 'USL A-League']['Year'].max()\n",
      "```\n",
      "> Pandas Output: 2004\n",
      "> Pandas Instructions:\n",
      "```\n",
      "df[df['League'] == 'USL A-League']['Year'].max()\n",
      "```\n",
      "> Pandas Output: 2004\n",
      "\u001b[1;3;38;2;155;135;227m> Running module 8d395cd7-7c37-4589-9dc6-29f892e6caa8 with input: \n",
      "input: assistant: Step 1: Analyze the table to find the last year where the team was a part of the USL A-League.\n",
      "\n",
      "From the table, we can see that the team was a part of the USL A-League in the years 2001, 20...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module 34a76ade-7342-4e27-ad42-efacbb9a93b9 with input: \n",
      "input: assistant: To determine the last year in which this team was a part of the USL A-League, we need to scan the \"League\" column in the table. Looking at the table, we can see that the team was a part of ...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module b868748b-3910-4c00-984d-cc938185f80b with input: \n",
      "input: assistant: Step 1: Identify the rows where the team was a part of the USL A-League.\n",
      "- In row 0, the team was a part of the USL A-League in the year 2001.\n",
      "- In row 1, the team was a part of the USL A-L...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module cb218681-3905-4dea-b995-74ccd3d155e2 with input: \n",
      "input: assistant: To determine the last year in which this team was a part of the USL A-League, we need to analyze the \"League\" column in the table. Let's go step by step:\n",
      "\n",
      "- In row 0, the league is \"USL A-L...\n",
      "\n",
      "\u001b[0m\u001b[1;3;38;2;155;135;227m> Running module 3b57f1d9-b988-45e0-b228-f48747e42055 with input: \n",
      "input: assistant: Step 1: Identify the rows where the \"League\" column mentions \"USL A-League.\"\n",
      "\n",
      "|    |   Year |   Division | League              | Regular Season   | Playoffs        | Open Cup        | Avg. ...\n",
      "\n",
      "\u001b[0mAggregation mode: self-consistency\n",
      "Text results: ['2003', '2003', '2004', '2003', '2003']\n",
      "Symbolic results: ['2004', '2004', '2004', '2004', '2004']\n",
      "2004\n"
     ]
    }
   ],
   "source": [
    "response = await query_engine.aquery(example[\"utterance\"])\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
